Using Memory-Protection to Simplify Zero-copy Operations
Abstract
High-performance networks (e.g. InfiniBand) rely on zero-copy operations for performance. Zero-copy operations, as the name implies, avoid copying buffers when sending and receiving data. Instead, hardware devices read and write directly to application-specified areas of memory. Since modern high-performance networks can send and receive at nearly the same speed as the memory bus inside machines, zero-copy operations are necessary to achieve peak performance for many applications.

Unfortunately, programming with zero-copy APIs is a giant pain. Users must carefully avoid using buffers that may be accessed by a device. Typically this results in either spaghetti code (where every access to a buffer is checked before use) or blocking operations (which pretty much defeat the whole point of zero-copy).

We show that by abusing memory-protection hardware, we can offer the best of both worlds: a simple zero-copy mechanism that allows non-blocking sends and receives while protecting against incorrect accesses.

To make things concrete, consider an MPI computation that, as part of its operation, maintains a distributed table. Each worker might have code like this:

    // map from int to a big blob
    hash_map table;
    PutRequest put;
    GetRequest get;
    while (1) {
      if (world.Iprobe(ANY_SOURCE, PUT_REQUEST)) {
        world.Recv(peer, PUT_REQUEST, &put, sizeof(put));
        memcpy(table[put.key], put.value);
      }
      if (world.Iprobe(ANY_SOURCE, GET_REQUEST)) {
        world.Recv(ANY_SOURCE, GET_REQUEST, &get, sizeof(get));
        world.Send(peer, GET_RESPONSE, table[get.key]);
      }
    }

(Naturally, each worker will probably be doing something else in addition to maintaining its table.)

This is fairly straightforward but inefficient code: the blocking send operation prevents the worker from doing anything else until the current get request completes. We'd like to switch to the non-blocking (ISend) primitive for efficiency. A first attempt at this might look like:

    list pending;
    while (1) {
      ...
      if (world.Iprobe(ANY_SOURCE, GET_REQUEST)) {
        world.Recv(ANY_SOURCE, GET_REQUEST, &get, sizeof(get));
        pending.push(world.ISend(peer, GET_RESPONSE, table[get.key]));
      }
      // remove any finished sends
      check_for_completed(&pending);
    }

Of course, after running this (or, if we're smart, before running it), we realize that there is a possible conflict between our put and get requests. We can now handle multiple simultaneous sends, but what if we
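The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way memory protection could guard a buffer during a non-blocking send, not the author's implementation. It assumes a POSIX system (Linux or similar). The names `GuardedBuffer`, `guard_init`, `begin_send`, `wait_for_send`, and `guarded_write` are hypothetical, and `wait_for_send` is a stand-in for waiting on a real network request. The buffer is made read-only while a send is pending; a write to it faults, and the fault handler waits for the send to finish, restores write access, and lets the write retry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical bookkeeping for one buffer a device may still be reading.
struct GuardedBuffer {
  char* data = nullptr;
  std::size_t size = 0;
  volatile sig_atomic_t send_pending = 0;  // 1 while a send is in flight
  volatile sig_atomic_t faults = 0;        // number of writes we intercepted
};

static GuardedBuffer g_buf;  // a single global buffer keeps the sketch short

// Stand-in for waiting on the real network request (assumption: a real
// version would block until the non-blocking send has completed).
static void wait_for_send() { g_buf.send_pending = 0; }

// Fault handler: if the fault is a write to our protected buffer, wait for
// the send, unprotect the pages, and return so the faulting write retries.
// Note: mprotect is not on POSIX's async-signal-safe list; calling it here
// works in practice on Linux but is an assumption of this sketch.
static void on_fault(int sig, siginfo_t* info, void*) {
  char* addr = static_cast<char*>(info->si_addr);
  bool ours = g_buf.send_pending != 0 && addr >= g_buf.data &&
              addr < g_buf.data + g_buf.size;
  if (!ours) {               // a genuine crash: fall back to default handling
    signal(sig, SIG_DFL);
    return;
  }
  g_buf.faults = g_buf.faults + 1;
  wait_for_send();
  mprotect(g_buf.data, g_buf.size, PROT_READ | PROT_WRITE);
}

// Allocate a page-aligned buffer and install the fault handler.
bool guard_init(std::size_t bytes) {
  std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  std::size_t rounded = (bytes + page - 1) / page * page;
  void* p = mmap(nullptr, rounded, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_buf.data = static_cast<char*>(p);
  g_buf.size = rounded;

  struct sigaction sa;
  std::memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = on_fault;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  // Some systems report protection faults as SIGBUS rather than SIGSEGV.
  return sigaction(SIGSEGV, &sa, nullptr) == 0 &&
         sigaction(SIGBUS, &sa, nullptr) == 0;
}

// Mark a send as in flight and make the buffer read-only until it finishes.
bool begin_send() {
  assert(g_buf.data != nullptr && "call guard_init first");
  g_buf.send_pending = 1;
  return mprotect(g_buf.data, g_buf.size, PROT_READ) == 0;
}

// Write one byte through a volatile pointer so the store is not reordered.
void guarded_write(std::size_t i, char c) {
  volatile char* p = g_buf.data;
  p[i] = c;
}
```

A caller would run `guard_init`, call `begin_send` when it hands the buffer to the network, and then write to the buffer as usual; only a write that races with the pending send pays the cost of a fault. Protection works at page granularity, so unrelated data sharing a page with the buffer would also trigger faults. A buffer being received into would presumably need `PROT_NONE` rather than `PROT_READ`, since reads of it are unsafe too.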
Journal: CoRR
Volume: abs/1304.0012
Publication date: 2013